Abstract. A longitudinal study of four- and nine-month-old
infants indicates that they coordinate the timing of
their vocal behavior with that of their mothers and
vice versa. Maternal interactions of Down-syndrome 
and nondelayed infants were analyzed and found not to 
differ with regard to such temporal coordination, indi¬ 
cating that it is independent of level of cognitive func¬ 
tioning. The capacity for coordinated timing is proposed 
as a mechanism for the facilitation of social interaction. 
Such coordination parallels temporal matching observed 
in a variety of species along the phylogenetic scale. 

Introduction 

Beginning at least with the work of the Gardners
(Gardner and Gardner, 1969; Gardner and Gardner,
1974), researchers have explored the extent to which
animals can communicate as do human beings. Our
research, on the other hand, has been concerned, in part,
with the question of whether human social interaction is 
made possible, or facilitated by, capacities that are shared 
with other species and serve the same functions. We re¬ 
port here the results of a longitudinal study of the tempo¬ 
ral structure of social communication between nonde¬ 
layed and Down-syndrome infants in the first year of life 
and their mothers. The results suggest that coordinated 
interpersonal timing may serve as a mechanism for the 
facilitation of social interaction. We conclude that such 
timing shares features of functionally adaptive social pre¬ 
dispositions present in other species. 
Conversation is the primary mode of conspecific
communication employed by Homo sapiens. Such exchange
is an important mechanism serving the organization and 
maintenance of human society. In this respect, conversa¬ 
tional exchange may be viewed as the functional ana¬ 
logue of the bird song and cricket chirp. While the infor¬ 
mation encoded in a chirp or a song sequence and a 
conversation may differ radically, the functional conse¬ 
quences of such a signal may be identical: facilitation of 
mating, bonding between infant and caretaker, guarding 
against predation, etc. It is in this functional sense that
we are considering a human vocal exchange as equiva¬ 
lent to vocal behaviors observed over a wide range of or¬ 
ganisms. There is an extensive body of evidence 
(Feldstein and Welkowitz, 1987) showing that conversa¬ 
tional exchange between adult speakers possesses a com¬ 
plex statistical temporal structure; a structure not en¬ 
tirely subsumed by the syntactic and semantic aspects of 
such an exchange. Of central interest to our investigation 
is coordinated interpersonal timing, which refers to an 
alteration in the temporal patterning of one speaker's be¬ 
havior as a function of that of the other speaker. 

Work with invertebrates, especially insects, and with 
simple vertebrate models, has begun to delineate a vari¬ 
ety of genetic and neurologic factors that are responsible 
for the temporal organization of social behavior. Thus, 
for example, investigators (Zehring et al., 1984;
Hamblen et al., 1986) have isolated mutations mapped to a
particular region of the X chromosome in Drosophila.
Mutations on this locus increase, decrease, or destroy
completely the temporal pattern of the male fly's mating
song. A unique coding sequence that forms a portion of
this locus has recently been identified in several
vertebrates (Schildberger, 1984). Groups of neurons that act
as temporal filters have been identified in crickets
(Schildberger, 1984). These filters are "tuned" to the
temporal properties of the conspecific song. Temporal
filters sensitive to biologically salient stimuli have also
been identified in several species of toads (Rose and
Capranica, 1984), the electric fish, Eigenmannia (Partridge
and Heiligenberg, 1981), and in rats (Rees and Moller,
1983).

The importance of a capacity for temporal attunement
in terms of the organism's survival is not to be
underestimated. Zelick (1986) emphasized the crucial ecological
function served by the temporal patterning of
vocalization in certain frogs and the electric organ discharge in
the weakly electric fish. Both fish and frogs have adopted
similar strategies of signal oscillator timing to avoid
signal overlap and jamming between conspecifics.
Lamprecht et al. (1985) detailed the utility of distance-call
duets in bar-headed geese (Anser indicus).

In all of this work, the important variable is the tempo¬ 
ral dimension of the signal. We emphasize that our inves¬ 
tigation is concerned with precisely this dimension and 
not with the linguistic process. Obviously, language ex¬ 
hibits a range of phenomena that possess important tem¬ 
poral features. However, our analysis is not concerned 
with elucidating the temporal patterns of different
linguistic processes such as phonemes, vowel recognition,
or other more "molecular" linguistic features. Our
method of analysis is neutral with respect to the content 
of the acoustic signal. 

We wished to determine (a) whether human infants 
are attuned to the temporal properties of the vocal be¬ 
havior of their adult partners in a dyadic exchange and 
(b) the extent to which adults interacting with an infant 
are similarly responsive to the temporal characteristics 
of the infant's vocal behavior. Finally, we examined the 
possibility of a relationship between the capacity for tem¬ 
poral attunement and cognitive development. Given 
that the capacity for coordinated timing appears to be 
expressed by organisms at various levels of the
phylogenetic scale, we expected it to be independent of cognitive
functioning. It was this conjecture that dictated our
choice of infants with Down syndrome as one of our two
groups of subjects¹ (Gibson, 1978). We note that other
investigators have utilized the impairment of cognitive
functioning of persons with Down syndrome to
disentangle the role played by cognitive functioning in a
variety of human behaviors. Down syndrome offers a
valuable window into the study of human development. The
pattern of cognitive deficit associated with Down
syndrome is fairly well understood, and a number of studies
(Cicchetti and Serafica, 1981; Cicchetti and Sroufe, 1976;
Serafica and Cicchetti, 1976; Spiker, 1983) of
Down-syndrome infants and young children indicate that they
show a pattern of delay rather than deficit in their social
development. Workers such as Cicchetti and his
colleagues (Cicchetti and Serafica, 1981; Cicchetti and
Sroufe, 1976) have, in fact, used Down syndrome as a
model for studying the interaction of social and cognitive
development. Our choice of infants afflicted with the
syndrome was motivated by the same rationale.

1 Down syndrome refers to a genetically based condition that
involves, among other problems, cognitive delay and/or retardation. The
average Mental Development Index of the Down infant group was 64,
which is inflated because no score below 50 could be calculated; the
average for the nondelayed group [...]. Note that the name, Down's
syndrome, has recently been changed [...] make the use of Down or
Down's equally acceptable.

Materials and Methods 


Participants 

The participants were two groups of Caucasian 
mother-infant pairs. In one group of nine pairs, the in¬ 
fants were afflicted with Down syndrome (trisomy 21). 
Nine pairs of normal, nondelayed infants and their
mothers comprised the other group. The infants with
Down syndrome were recruited from community groups
that provide services for such infants as well as from
notices in the media. The nondelayed infants were
recruited from notices placed in parent-child newsletters.
All the mothers in the study were native speakers of
English and were high school graduates. At the initial
session, each of the mothers was given a brief rating scale
for depression (Radloff, 1977) and another for anxiety
(Zuckerman and Lubin, 1965). None of the mothers
used in the study were found to be clinically anxious 
and/or depressed. A medical history was obtained from 
each of the mothers concerning her own health, a history 
of the pregnancy, and her infant's health. To the best of 
our knowledge, none of the infants with Down syndrome 
used in this study presented any relevant health prob¬ 
lems. 

Procedure 

The pairs were seen when the infants were within two 
weeks of being four months old, and within two weeks 
of being nine months old. On the second occasion, all 
the infants were given the Bayley Mental Development 
Scale (Bayley, 1969). Each of the mother-infant pairs en¬ 
gaged in a standard face-to-face play procedure for 12 
minutes in the Interpersonal Communications
Laboratory of the University of Maryland Baltimore County. At
the four-month data point, the infant was seated in an
infant seat directly across from its mother at an elevation
such that mother and infant could comfortably achieve
eye contact. At the nine-month point, the infant was
seated in an infant chair again oriented in such a way as
to make face-to-face interaction comfortable. Also at the
nine-month point, the mothers were given a small hand 
puppet to use as a means of focusing the interaction. This 
procedure is a standard one and has been used in similar 
studies (Jasnow and Feldstein, 1986). Two-channel,
12-minute tape recordings were made of each mother-infant
interaction. To minimize the spill of one person's voice 
into the microphone of the other person, contact micro¬ 
phones were used. If during the course of an interaction, 
the infant became fussy, the taping continued for 30 sec¬ 
onds. If at the end of the 30 seconds the baby was not re¬ 
engaged, the taping was stopped until such time as the 
baby was able to continue. 

Vocal analysis

The coding of the vocal behavior was accomplished 
via the direct input of two audio signals, representing the 
infant and adult, into a specialized computer system 
known as the Automated Vocal Transaction Analyzer 
(AVTA) (Jaffe and Feldstein, 1970). AVTA is a hardware
and software system. The hardware component is an
analogue-to-digital converter that "listens" to two channels
of incoming audio signals to determine whether the sig¬ 
nal in each channel is on or off. The audio signals repre¬ 
sent the vocal behavior of the two partners. Both se¬ 
quences of signals are sampled by the A-to-D converter 
every 250 milliseconds and are stored digitally in the
computer in the form of one sequence of four numbers: 
one signal is on and the other is off, or vice versa; both 
are on; or both are off. The AVTA system transforms the 
decimal numbers into the set of dialogic vocal parame¬ 
ters defined below and summarizes them as frequencies, 
proportions, average durations, and standard deviations 
for a fixed time interval. The time sampling interval was 
five seconds because it is approximately equal to the 
mean plus one standard deviation of each parameter, 
with the exception of the turn. Inasmuch as the maternal 
average speaking turns were longer than five seconds, the 
same criterion yielded 30 seconds as the appropriate 
sampling unit for the adult turn. 
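
The sampling scheme described above can be illustrated with a minimal sketch. It is not the AVTA implementation: the numeric state codes (0 to 3) and the names `State`, `encode`, `samples_per_unit`, and `units_in_session` are assumptions introduced here for illustration. Only the 250-ms sampling period, the four joint on/off states, and the 5- and 30-second units come from the text.

```python
from enum import IntEnum

SAMPLE_MS = 250  # sampling period stated in the text


class State(IntEnum):
    """Joint on/off state of the two channels (numeric codes are assumed)."""
    BOTH_OFF = 0
    INFANT_ONLY = 1
    ADULT_ONLY = 2
    BOTH_ON = 3


def encode(infant_on, adult_on):
    """Map two equal-length on/off sequences to one sequence of four states."""
    if len(infant_on) != len(adult_on):
        raise ValueError("channels must have the same number of samples")
    return [State(int(i) + 2 * int(a)) for i, a in zip(infant_on, adult_on)]


def samples_per_unit(unit_seconds):
    """Number of 250-ms samples in one summary unit (e.g. 5 s or 30 s)."""
    return (unit_seconds * 1000) // SAMPLE_MS


def units_in_session(minutes, unit_seconds):
    """Number of whole summary units in a session of the given length."""
    return (minutes * 60) // unit_seconds


if __name__ == "__main__":
    infant = [True, False, True, False]
    adult = [False, False, True, True]
    print([s.name for s in encode(infant, adult)])
    print(samples_per_unit(5), units_in_session(12, 5), units_in_session(12, 30))
```

Under these assumptions a 5-second unit holds 20 samples, and a 12-minute session yields 144 five-second units or 24 thirty-second units.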

The vocal parameters (Jaffe and Feldstein, 1970; 
Feldstein and Welkowitz, 1987) generated by the AVTA 
system are speaking turns, vocalizations, pauses,
switching pauses, and simultaneous speech. Simultaneous
speech had too low a rate of occurrence to be included
in these analyses. A turn begins the instant a participant
starts to vocalize alone and ends immediately prior to the
instant that the other participant starts to vocalize alone.
A vocalization is a segment of sound uninterrupted by
any discernible silence. A pause is an interval of joint
silence that is initiated and terminated by vocalizations
of the same participant. A switching pause is a joint
silence initiated by the participant who has the turn and
terminated by a vocalization of the other participant
(Fig. 1).

Figure 1. A diagrammatic representation of a conversational
sequence. The numbered line at the bottom represents time in 250-ms
units. V stands for vocalization, P for pause, and SP for switching pause
(the silence that occurs immediately prior to a change in the speaking
turn). The arrows that point down denote the end of the infant's turns;
the arrows that point up denote the end of the adult's turns. ISS and
NSS stand for interruptive and noninterruptive simultaneous speech,
respectively. (Adapted from Figure 11-2 of Jaffe and Feldstein, 1970).
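
The definitions of pause and switching pause can be made concrete with a short sketch that walks a sequence of sampled states and labels each run of joint silence. The state codes, the function names, and the treatment of a silence that ends in simultaneous speech (counted here as a pause of the turn holder) are assumptions made for illustration, not the AVTA rules.

```python
# Assumed state codes for one 250-ms sample of the two channels.
BOTH_OFF, INFANT_ONLY, ADULT_ONLY, BOTH_ON = 0, 1, 2, 3
SOLO = {INFANT_ONLY: "infant", ADULT_ONLY: "adult"}


def classify_silences(states):
    """Label each run of joint silence as a pause or a switching pause.

    Returns (kind, turn_holder, length_in_samples) tuples. The turn passes
    to a participant when that participant vocalizes alone. Silence before
    the first solo vocalization, or running to the end of the record, is
    left unclassified.
    """
    events = []
    holder = None
    i, n = 0, len(states)
    while i < n:
        s = states[i]
        if s in SOLO:
            holder = SOLO[s]          # solo vocalization takes or keeps the turn
            i += 1
        elif s == BOTH_OFF and holder is not None:
            j = i
            while j < n and states[j] == BOTH_OFF:
                j += 1                # extend to the end of the silent run
            if j < n:
                # Simplifying assumption: simultaneous speech after the
                # silence leaves the turn with the current holder.
                next_speaker = SOLO.get(states[j], holder)
                kind = "pause" if next_speaker == holder else "switching_pause"
                events.append((kind, holder, j - i))
            i = j
        else:
            i += 1
    return events


def mean_duration_ms(events, kind, owner, sample_ms=250):
    """Average duration in ms of one event type for one participant."""
    durations = [k * sample_ms for e, o, k in events if e == kind and o == owner]
    return sum(durations) / len(durations) if durations else None


if __name__ == "__main__":
    demo = [1, 1, 0, 1, 0, 0, 2, 2, 0, 2, 0, 1]
    for event in classify_silences(demo):
        print(event)
```

In the demonstration sequence the infant holds the first turn, pauses once, and then yields the turn after a two-sample switching pause; the adult's turn shows the same pattern.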

Statistical analyses 

The dyadic time series was divided into five-second²
segments, or time units, yielding 144 five-second units
(over 12 min). A time-series regression (TSR) analysis
(Ostrom, 1978) was computed for each parameter to
assess the occurrence of coordinated interpersonal timing
for each mother-infant pair. The TSR was accomplished
by a three-step procedure. The series were first subjected
to an ARIMA (SPSSX) modeling procedure for the
purpose of "pre-whitening" the data. The ACF subprogram
of SPSSX Trends was used to allow for visual and
statistical checks to test which model parameters best fit the
data and met the assumptions made by the model. It was
determined that the most useful parameter values were
2, 0, 0. Each series was prewhitened separately.
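
An ARIMA model with parameters 2, 0, 0 is a second-order autoregressive model, so prewhitening amounts to fitting two autoregressive coefficients and keeping the residuals. The sketch below illustrates the idea with Yule-Walker estimates. This is a swapped-in estimation method chosen for brevity and is not the SPSSX Trends procedure; all function names are hypothetical.

```python
import random


def autocorr(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x)
    if c0 == 0:
        raise ValueError("series is constant")
    ck = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return ck / c0


def fit_ar2(x):
    """Yule-Walker estimates (phi1, phi2) for an AR(2) model."""
    r1, r2 = autocorr(x, 1), autocorr(x, 2)
    denom = 1.0 - r1 * r1
    return r1 * (1.0 - r2) / denom, (r2 - r1 * r1) / denom


def prewhiten(x):
    """Residuals of the fitted AR(2) model (two observations shorter than x)."""
    mean = sum(x) / len(x)
    phi1, phi2 = fit_ar2(x)
    d = [v - mean for v in x]
    return [d[t] - phi1 * d[t - 1] - phi2 * d[t - 2] for t in range(2, len(d))]


def simulate_ar2(n, phi1, phi2, seed=0):
    """Synthetic AR(2) series, used only to exercise the functions above."""
    rng = random.Random(seed)
    x = [0.0, 0.0]
    for _ in range(n):
        x.append(phi1 * x[-1] + phi2 * x[-2] + rng.gauss(0.0, 1.0))
    return x[2:]


if __name__ == "__main__":
    series = simulate_ar2(500, 0.6, 0.2, seed=1)
    residuals = prewhiten(series)
    print("lag-1 autocorrelation before:", round(autocorr(series, 1), 3))
    print("lag-1 autocorrelation after: ", round(autocorr(residuals, 1), 3))
```

If the model is adequate, the residual series should show little remaining autocorrelation, which is the property that the ACF checks described above are meant to confirm.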

After the selection of the appropriate model, the TSR 
analyses were computed by the AREG subprogram of 
SPSSX Trends. It is the temporal coordination that oc¬ 
curs in the current 5- or 30-second sampling interval that 
was used in this report. In other words, we were
concerned with the degree to which changes in one series are
reflected by changes in the other series within the same
time frame. This relationship is indexed by the
standardized partial regression coefficient, which is used as a
coefficient of coordinated timing.

2 The five-second time unit was used for the TSR analyses of all the
parameters but maternal speaking turns. Average values were
computed for each parameter for every five seconds of interaction and for
every 30 seconds in the case of maternal turns. Five seconds was chosen
because it is approximately equal to the mean + 1 standard deviation
for each parameter. Maternal speaking turns had a significantly greater
mean value and thus 30 seconds was selected as a more appropriate
time unit.

Table I

Summary of Chi square analyses of the results of time-series regressions

* P > .05.

Note. A is the df for the Chi square. The R represents the average
standardized partial regression coefficient. T stands for Turns, P for
Pauses, SP for Switching Pauses, V for Vocalizations. The "N" stands
for "Nondelayed," the "DS" stands for "Down's syndrome."

We wanted two kinds of information. One was 
whether the group of dyads involving Down-syndrome 
infants and the group of dyads involving nondelayed
infants each engaged in coordinated interpersonal timing.
This information was provided by a meta-analytic
approach in which standard normal deviate scores are
obtained for the probability values associated with the
regression coefficients. Each of these standard scores is
squared to yield a Chi square with one degree of freedom.
The Chi squares are then summed for each group of
dyads to provide a Chi square test (with df equal to the
number of Chi squares in the sum) of whether the
regression coefficients in each group were significantly different
from zero. Another was whether the two groups differed
in terms of the extent to which they engaged in
coordinated timing. Differences between the two groups of
mother-infant pairs (nondelayed and delayed) and
between the two age groups (four and nine months) were
assessed by a split-plot analysis of variance. 
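
The meta-analytic step can be sketched as follows. Each probability value is converted to a standard normal deviate, the deviates are squared, and the squares are summed, with df equal to the number of values combined. The sketch assumes that the probability values are two-tailed, which the text does not state, and the function name is hypothetical.

```python
from statistics import NormalDist


def combine_p_values(p_values):
    """Sum of squared normal deviates for a set of p-values.

    Returns (chi_square, df). Assumes two-tailed p-values, so that each
    squared deviate follows a Chi square with one degree of freedom when
    the corresponding regression coefficient is truly zero.
    """
    normal = NormalDist()
    chi_square = 0.0
    for p in p_values:
        if not 0.0 < p <= 1.0:
            raise ValueError("p-values must lie in (0, 1]")
        z = normal.inv_cdf(1.0 - p / 2.0)  # standard normal deviate
        chi_square += z * z                # one-df Chi square contribution
    return chi_square, len(p_values)


if __name__ == "__main__":
    # Hypothetical p-values for the nine dyads in one group.
    dyad_p_values = [0.03, 0.20, 0.01, 0.45, 0.08, 0.15, 0.02, 0.60, 0.05]
    chi_square, df = combine_p_values(dyad_p_values)
    print(f"Chi square = {chi_square:.2f} with df = {df}")
```

The resulting statistic is compared with the Chi square distribution on the returned df. The values in the usage example are invented for illustration and are not data from this study.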

Results 

The Chi square analyses of the results indicate that
mutual coordination occurred for all but one of the
temporal parameters, at both 4 and 9 months, regardless of
diagnosis (Table I).

Given that the meta-analytic results demonstrate that
temporal coordination occurred across all but one of the
vocal behaviors, or parameters, the question is whether
the two groups can be discriminated on the basis of their
magnitudes of coordination. The analysis of variance of
the pauses, switching pauses, and vocalizations yielded a
significant main effect for diagnosis (F[1, 17] = 4.34,
P = .05, f = .40), indicating that the dyads with the delayed
infants seemed to engage in less coordination than their
nondelayed counterparts. However, the occurrence of a
significant interaction of diagnosis by age (F[1, 17]
= 4.34, P = .05, f = .40) indicates that the apparent
general difference between the two groups is primarily
attributable to a significantly lower degree of coordination of
the Down-syndrome dyads at four months of age. By the
time the delayed infants reach nine months of age, their
average degree of coordination with their mothers is
similar to that of the nondelayed dyads (Fig. 2).

The results of the analysis of speaking turns (done sep¬ 
arately because of the larger sampling interval) provide 
no evidence of a difference in degree of coordination be¬ 
tween the dyads with the delayed infants and those with 
the nondelayed infants (F[1, 17] = 0.00, P = .959). Nor
did the magnitude of coordination of either group of
dyads change markedly with time (F[1, 17] = 0.70,
P = .415).

Discussion 

The results offer support for the hypothesis that infants
and their mothers coordinate the temporal organization
of their vocal behavior both when the infants are four
months and nine months old. They demonstrate that the
temporal phenomenon found to characterize adult
conversation (Partridge and Heiligenberg, 1981; Feldstein
and Welkowitz, 1987) is present in adult-infant
interactions from as early as four months of age and that the
results are true not only for nondelayed infants, but also
for infants afflicted with Down syndrome. Thus the
study, having used a group about whose cognitive
impairment there can be no doubt, represents a strong test
of the proposition that coordinated interpersonal timing
is independent of cognitive ability.

Figure 2. The interaction of the diagnosis by age, indicating that
whereas the degree of coordinated interpersonal timing of the dyads
with nondelayed infants is similar when the infants are four and nine
months of age, that of the dyads with Down's syndrome infants increases
significantly from four to nine months. (Horizontal axis: age in months.)

Note, however, that although coordination appears to 
be a general phenomenon detected in both groups at 
both ages, the two groups could be discriminated on the 
basis of the lower degree of coordination exhibited by 
the Down-syndrome infants and their mothers at four 
months of age. This finding of lower coordination at four 
months increasing, by nine months, to a level similar to 
that of the nondelayed dyads, is consistent with the
findings from a wide array of studies about the social
behavior of Down-syndrome infants. These studies (Cicchetti
and Sroufe, 1976; Serafica and Cicchetti, 1976; Cicchetti
and Serafica, 1981; Spiker, 1983) have shown that
dysfunctional aspects of social behavior of Down-syndrome
infants and young children are related to deviations in 
rate of development and not to deficits in development. 

The capacity to process and respond to the temporal 
patterning of human vocalizations may enable the infant 
to select and "lock onto" a biologically important
environmental stimulus. That this capacity is present in
infants suffering from severe cognitive impairment
suggests that it may be buffered against insults to the
organism. In other words, it may be that the capacity functions
to make social interaction possible. The underlying neu¬ 
romechanisms responsible for such temporal sensitivity 
are not known. Workers such as Rose (1986) have ob¬ 
served that many different types of organisms employ the 
same set of neurons in the midbrain for processing cer¬ 
tain varieties of temporal information. Rose speculated 
that similar mechanisms may be operative in human
beings. Zelick (1986) pointed out that the behavior
strategies adopted by certain frogs and the weakly electric fish
to avoid signal jamming are quite similar, and suggested 
that common neuromechanisms may be responsible for 
the common behavioral strategy. Whether the mecha¬ 
nisms underlying the behaviors described in this report 
are similar to those that operate in nonhuman organisms 
remains open to investigation. 

There is no doubt that the temporal patterning of so¬ 
cial interaction is a fundamental aspect of behavior in 
any given ecological setting. Marler and Terrace (1984) 
noted that "The mechanisms that underlie imprinting
and song learning cannot be understood without first
acknowledging the pervasive role of unlearned,
functionally adaptive predispositions to associate particular
classes of stimuli" (p. 5). We conjecture that the
responsivity to the temporal patterning of vocal behavior
demonstrated by the findings presented here is an instance
of such a functionally adaptive predisposition in human 
beings. 